Casting votes of antecedents play a key role in successful sequential decision-making

Aggregation of opinions often results in high decision-making accuracy, owing to the collective intelligence effect. Studies on group decisions have examined the optimum weights for opinion aggregation to maximise accuracy. In addition to the optimum weights of opinions, the impact of the correlation among opinions on collective intelligence is a major issue in collective decision-making. We investigated how individuals should weigh the opinions of others and their own to maximise their accuracy in sequential decision-making. In our sequential decision-making model, each person makes a primary choice, observes his/her predecessors’ opinions, and makes a final choice, which results in the person’s answer correlating with those of others. We developed an algorithm to find casting voters whose primary choices are determinative of their answers and revealed that decision accuracy is maximised by considering only the abilities of the preceding casting voters. We also found that for individuals with heterogeneous abilities, the order of decision-making has a significant impact on the correlation between their answers and their accuracies. This could lead to a counter-intuitive phenomenon whereby, in sequential decision-making, respondents are, on average, more accurate when less reliable individuals answer earlier and more reliable individuals answer later.


S.1.1 General case
Optimum weights to maximise conditional performance
We assume N individuals, labelled 1, 2, ..., N, and consider two sets S and T.

In summary, when $\sum_{n \in S} r^*_n > \sum_{n \in T} r^*_n$, the weights should always satisfy $W(S) > W(T)$ to ensure that s is the outcome of the weighted majority vote with a probability of 1. A similar discussion can also be applied to the case of $L(S) < L(T)$, i.e. $\sum_{n \in S} r^*_n < \sum_{n \in T} r^*_n$.
Thus, given the opinion distribution (S, T) of votes, the condition that the weights should satisfy to maximise the conditional performance of the weighted majority vote can be summarised as follows: A) $W(S) > W(T)$ when the likelihood of s being correct is greater than that of t, i.e. $\sum_{n \in S} r^*_n > \sum_{n \in T} r^*_n$, and B) $W(S) < W(T)$ when the likelihood of t being correct is greater than that of s, i.e. $\sum_{n \in S} r^*_n < \sum_{n \in T} r^*_n$.
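Rules A) and B) can be checked numerically for a given opinion distribution. The following is a minimal sketch, assuming that $r^*_n$ is the log-odds $\log[p_n/(1 - p_n)]$ of the ability $p_n$; the function names, the 0-based indices, and the example abilities are hypothetical, and ties are not handled.

```python
import math

def log_odds(p):
    """Log-odds r* = log(p / (1 - p)) of an ability p with 0.5 < p < 1 (assumed definition)."""
    return math.log(p / (1.0 - p))

def likelier_alternative(abilities, S, T):
    """Return 's' if the votes in S make s likelier to be correct than t, else 't'."""
    r_s = sum(log_odds(abilities[n]) for n in S)
    r_t = sum(log_odds(abilities[n]) for n in T)
    return "s" if r_s > r_t else "t"

def satisfies_rules(weights, abilities, S, T):
    """True if the weighted majority outcome equals the likelier alternative (rules A and B)."""
    w_s = sum(weights[n] for n in S)
    w_t = sum(weights[n] for n in T)
    outcome = "s" if w_s > w_t else "t"
    return outcome == likelier_alternative(abilities, S, T)

abilities = [0.9, 0.6, 0.6]            # hypothetical abilities p_n
S, T = [0], [1, 2]                     # individual 0 votes s, the other two vote t
equal_weights = [1 / 3, 1 / 3, 1 / 3]
z = sum(log_odds(p) for p in abilities)
log_odds_weights = [log_odds(p) / z for p in abilities]

print(satisfies_rules(equal_weights, abilities, S, T))     # equal weights let the two weaker voters win
print(satisfies_rules(log_odds_weights, abilities, S, T))  # log-odds weights follow the likelier side
```

In this example the single strong voter outweighs the two weaker ones in log-odds, so equal weights violate rule A) whereas the normalised log-odds weights satisfy it.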
When all individuals' votes are the same, that is, either S or T is empty, the outcome should be the alternative chosen by them because $\prod_{n=1}^{N} p_n > \prod_{n=1}^{N} (1 - p_n)$ is always satisfied as $p_n > 0.5$, and the likelihood that their consensus is correct is therefore greater than that of the opposite alternative. In fact, their consensus always becomes the outcome of a weighted majority vote, because $\sum_{n=1}^{N} w_n = 1 > 0.5$. Therefore, by defining $P(X) = Q(X) = 1$ for an empty set X, we can also determine the optimum weights to maximise the conditional performance given (S, T) where either S or T is empty according to A) and B).
Based on the discussion thus far, the conditional performance given (S, T ), which is maximised by assigning weights as the rules of A) and B), can be written as

Geometrical interpretation of condition for optimum weights
Here, we provide a geometrical interpretation of the condition on the weights for maximising the conditional performance of the weighted majority vote given the opinion distribution (S, T). We consider the (N − 1)-simplex $\{w; \sum_{n=1}^{N} w_n = 1\}$ and the hyperplane determined by the equation $W(S) = W(T) = 0.5$. Given the opinion distribution (S, T), by comparing $\sum_{n \in S} r^*_n$ with $\sum_{n \in T} r^*_n$, we can determine the domain $D_{(S,T)}$ of weights that maximise the conditional performance according to rules A) and B). $D_{(S,T)}$ is one of the two domains into which the (N − 1)-simplex is divided by the hyperplane $W(S) = W(T)$: $D_{(S,T)} = \{w; W(S) > W(T), \sum_{n=1}^{N} w_n = 1\}$ when $\sum_{n \in S} r^*_n > \sum_{n \in T} r^*_n$, and $D_{(S,T)} = \{w; W(S) < W(T), \sum_{n=1}^{N} w_n = 1\}$ when $\sum_{n \in S} r^*_n < \sum_{n \in T} r^*_n$. In particular, $D_{(S,T)} = \{w; \sum_{n=1}^{N} w_n = 1\}$ (the entire (N − 1)-simplex) when S or T is empty, that is, when all individuals vote for the same alternative.
Given individual abilities, the domain of weights that maximise the conditional performance for any opinion distribution (S, T), $\bigcap_{(S,T)} D_{(S,T)}$, is not empty because it includes the point $r^*/Z = (r^*_1/Z, r^*_2/Z, ..., r^*_N/Z)$, where $Z = \sum_{n=1}^{N} r^*_n$. The weight $r^*/Z$ satisfies conditions A) and B) for any separation (S, T). Indeed, it is well known that the relative log-odds ratio of an individual's ability gives the optimum weight to maximise the accuracy of the results of the weighted majority vote [1,2,4]. As $D_{(S,T)}$ is an open set for each (S, T), the intersection $\bigcap_{(S,T)} D_{(S,T)}$ is not a point but an open set, that is, points other than $r^*/Z$ are in $\bigcap_{(S,T)} D_{(S,T)}$. Shapley and Grofman (1984) also noted that the optimum weight of an individual with a certain ability is not unique and can take values other than the relative log-odds ratio [2]. However, to the best of our knowledge, no study has geometrically summarised the region of optimum weights to maximise the conditional performance for any opinion distribution (S, T) given individuals' abilities, as shown in this section.
The weights in the domain $\bigcap_{(S,T)} D_{(S,T)}$, including $r^*/Z$, also maximise the mean of the conditional performance over all possible opinion distributions (S, T), which is called the mean performance. The mean performance of the weighted majority vote with weights in $\bigcap_{(S,T)} D_{(S,T)}$ can be calculated as follows:

It is also shown by Eq. (S.9) that the weights should be in $\bigcap_{(S,T)} D_{(S,T)}$ to maximise the mean performance.
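The mean performance of a weighted majority vote can also be obtained by brute-force enumeration of all vote patterns. The sketch below is an illustration only, not the closed form of Eq. (S.9); it assumes independent votes, weights that sum to 1, and no exact ties at 0.5, and the function name and example abilities are hypothetical.

```python
import math
from itertools import product

def mean_performance(weights, abilities):
    """Probability that the weighted majority vote is correct, summed over all vote patterns."""
    total = 0.0
    for correct in product([True, False], repeat=len(abilities)):
        prob = 1.0        # probability of this pattern of correct / wrong votes
        w_correct = 0.0   # total weight behind the correct alternative
        for ok, p, w in zip(correct, abilities, weights):
            prob *= p if ok else (1.0 - p)
            if ok:
                w_correct += w
        if w_correct > 0.5:   # the correct alternative wins the weighted vote
            total += prob
    return total

abilities = [0.9, 0.6, 0.6]   # hypothetical abilities
r = [math.log(p / (1.0 - p)) for p in abilities]
log_odds_weights = [x / sum(r) for x in r]
equal_weights = [1 / 3] * 3

print(mean_performance(log_odds_weights, abilities))
print(mean_performance(equal_weights, abilities))
```

For these abilities the log-odds weights reproduce the ability of the strongest individual, which is higher than the accuracy obtained with equal weights.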

Expert rule
In the case in which the abilities satisfy $r^*_n > \sum_{m \neq n} r^*_m$, the likelihood of the alternative chosen by individual n being correct is always greater than that of the opposite one for any opinion distribution. Therefore, for any opinion distribution, the sum of the weights of individuals whose votes are the same as that of individual n is always greater than 0.5.
Thus, $\bigcap_{(S,T)} D_{(S,T)}$ is $\{w; w_n > 0.5, \sum_{m=1}^{N} w_m = 1\}$. The condition $w_n > 0.5$ means that the outcome of the weighted majority vote is always the same as the vote by individual n.
The resulting mean performance of the weighted majority vote is then simply the ability $p_n$ of individual n. Let us call the collective decision-making method the expert rule governed by individual n when the outcome is always the same as the vote of that single individual, regardless of the separation (S, T).
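The condition for the expert rule can be tested directly from the abilities. The following is a small sketch, assuming $r^*_n = \log[p_n/(1 - p_n)]$; the helper name `governing_expert` and the 0-based indexing are hypothetical choices for illustration.

```python
import math

def log_odds(p):
    """Log-odds of an ability p with 0.5 < p < 1 (assumed definition of r*)."""
    return math.log(p / (1.0 - p))

def governing_expert(abilities):
    """Return the 0-based index n with r*_n greater than the sum of all other r*_m, or None."""
    r = [log_odds(p) for p in abilities]
    total = sum(r)
    for n, r_n in enumerate(r):
        if r_n > total - r_n:
            return n
    return None

print(governing_expert([0.9, 0.6, 0.6]))   # the first individual dominates
print(governing_expert([0.7, 0.7, 0.7]))   # nobody dominates
```

At most one individual can satisfy the condition, so returning the first match loses no information.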

S.1.2 Two individuals
We apply the calculation shown in Section S.1.1 to the case of simultaneous decision-making by two individuals.
When two individuals' votes are the same, the outcome of the weighted majority vote should be the alternative chosen by them to maximise the conditional performance, given the distribution of votes. In this case, where S or T is empty, the domain of optimum weights is the entire 1-simplex, $D_{(S,T)} = \{w; w_1 + w_2 = 1\}$. If their primary choices are different from each other, according to rules A) and B) in the previous section, $w_1 > w_2$ should be satisfied, i.e. $D_{(S,T)} = \{w; w_1 > w_2, w_1 + w_2 = 1\}$, to maximise the conditional performance, given two different votes, when $r^*_1 > r^*_2$ (i.e. $p_1 > p_2$). Similarly, $w_1 < w_2$ should be satisfied and $D_{(S,T)} = \{w; w_1 < w_2, w_1 + w_2 = 1\}$ when $r^*_1 < r^*_2$ (i.e. $p_1 < p_2$).
Therefore, in the case of two individuals, the domain of weights $\bigcap_{(S,T)} D_{(S,T)}$ that maximises the mean performance and the mean performance for such optimum weights are summarised as follows:
• If $p_1 > p_2$, $\bigcap_{(S,T)} D_{(S,T)} = \{w; w_1 > w_2, w_1 + w_2 = 1\}$. The mean performance is $p_1$.
• If $p_1 < p_2$, $\bigcap_{(S,T)} D_{(S,T)} = \{w; w_1 < w_2, w_1 + w_2 = 1\}$. The mean performance is $p_2$.
The mean performance is calculated using Eq. (S.9). The first case can be interpreted as follows: when the ability of individual 1 is greater than that of individual 2, the weights should satisfy $w_1 > 0.5$ to maximise the mean performance. Therefore, the expert rule governed by individual 1 is optimal when $p_1 > p_2$. Similarly, in the second case, the expert rule governed by individual 2 is optimal when the ability of individual 2 is greater than that of individual 1.

S.1.3 Three individuals
We apply the calculation of optimum weights shown in Section S.
where the mean performance is calculated according to Eq. (S.9). The first case can be interpreted as follows. Individual 1 is so excellent that $r^*_1 > r^*_2 + r^*_3$ is satisfied, and the optimum weights in this case should satisfy $w_1 > w_2 + w_3$, i.e. $w_1 > 0.5$. Therefore, the expert rule governed by individual 1 is the optimal decision-making method. The resulting mean performance is indeed just his/her ability $p_1$. Similarly, the second and third cases mean that individual 2 or individual 3, respectively, is so excellent that the expert rule governed by that individual is optimal. In the fourth case, none of them is excellent, that is, none of the log-odds ratios exceeds the sum of the other two. Every weight should be less than 0.5 in this case. Whenever the weights satisfy these inequalities, the outcome of the weighted majority vote is equal to that of the unweighted majority vote, because the sum of any two weights is always greater than the remaining one. In this case, the mean performance becomes M, which is the accuracy of the result of the unweighted majority vote among these three individuals; M is the sum of the probabilities that two or three, i.e. the majority, of the primary choices are correct.
A summary of the optimum weights and mean performance in simultaneous decision-making involving three individuals is also shown in Fig. 1 in the main text. The largest triangle exhibits the 2-simplex $\{w = (w_1, w_2, w_3); w_1 + w_2 + w_3 = 1\}$, where $W_1$, $W_2$, and $W_3$ denote (1, 0, 0), (0, 1, 0), and (0, 0, 1), respectively. Each smaller triangle exhibits the region of optimum weights when the corresponding inequality for $r^*_1$, $r^*_2$, and $r^*_3$ holds. For example, when the ability of individual 1 is so high that $r^*_1 > r^*_2 + r^*_3$ is satisfied, the optimum weights should satisfy $w_1 > 0.5$, i.e. $w_2 + w_3 < 0.5$, which corresponds to the upper red triangle, and these weights correspond to the expert rule governed by individual 1.
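The four cases for three individuals can be illustrated with a short sketch that selects the optimal rule and its mean performance. It assumes $r^*_n = \log[p_n/(1 - p_n)]$ and writes M out as the probability that at least two of the three votes are correct; the function names and the returned labels are hypothetical, and boundary cases with exact equality are assigned to the majority vote.

```python
import math

def log_odds(p):
    """Log-odds of an ability p with 0.5 < p < 1 (assumed definition of r*)."""
    return math.log(p / (1.0 - p))

def majority_accuracy(p1, p2, p3):
    """M: probability that at least two of three independent votes are correct."""
    return (p1 * p2 * p3
            + p1 * p2 * (1 - p3)
            + p1 * (1 - p2) * p3
            + (1 - p1) * p2 * p3)

def optimal_rule(p1, p2, p3):
    """Return (description of the optimal rule, mean performance under optimum weights)."""
    abilities = (p1, p2, p3)
    r = [log_odds(p) for p in abilities]
    for n in range(3):
        if r[n] > sum(r) - r[n]:   # one log-odds exceeds the sum of the other two
            return f"expert rule governed by individual {n + 1}", abilities[n]
    return "unweighted majority vote", majority_accuracy(p1, p2, p3)

print(optimal_rule(0.9, 0.6, 0.6))
print(optimal_rule(0.7, 0.7, 0.7))
```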

S.2 Sequential decision-making
Hereafter, we consider sequential decision-making, where N persons sequentially answer a binary choice problem, as explained in Section 2.1 in the main text. The primary choice of the n-th respondent is denoted by X n . He/she can observe the answers (Y 1 , ..., Y n−1 ) given by his/her antecedents and holds an optimally weighted majority vote among Y 1 , ..., Y n−1 and X n to maximise his/her conditional performance, given the answers of these antecedents and his/her primary choice, where Y m is the answer of the m-th respondent. The n-th respondent finally answers Y n , which is the outcome of the weighted majority vote. The ability of the n-th respondent is denoted by p n .

S.2.1 Casting vote theorem and optimal behaviour
Here, we show the derivation of the optimal behaviour for each respondent to maximise his/her conditional performance, given his/her antecedents' answers and primary choice (Section 3.2 in the main text). We also provide a proof of the casting vote theorem in Section 3.1 in the main text. As described in the main text, we call the primary choice of a respondent a casting vote when his/her answer is determined only by his/her primary choice, given an opinion distribution of the antecedents' answers. In addition, we call a respondent a casting voter when his/her primary choice is a casting vote.

First respondent's decision-making
The first respondent is always a casting voter because he/she decides independently by definition. Hereafter, let us denote the alternative answered by the first respondent as s and the opposite answer as t. We consider three subsets $S_n$, $T_n$ and $R_n$ of the set {1, 2, ..., n}, where $S_n$ is the set of indices of respondents (from the first to the n-th respondents) whose primary choices were casting votes and who answered s; $T_n$ is the set of indices whose primary choices were casting votes and who answered t; and $R_n$ is the set of indices whose primary choices were not casting votes. Thus, $S_1$ is defined as {1}. Both $T_1$ and $R_1$ are empty.

The n-th respondent
Subsequently, we consider decision-making by the n-th respondent (n > 1) in each of the following two cases: Case 1) the indices of his/her antecedents are separated into $S_{n-1}$ and $T_{n-1}$, and $R_{n-1} = \varnothing$; Case 2) the antecedents' indices are separated into $S_{n-1}$, $T_{n-1}$, and $R_{n-1}$ ($\neq \varnothing$).

Case 1) Antecedents' indices are separated into $S_{n-1}$ and $T_{n-1}$
First, we assume that the indices of the n-th respondent's antecedents are separated only into $S_{n-1}$ and $T_{n-1}$ ($S_{n-1} \cup T_{n-1} = \{1, 2, ..., n - 1\}$, $R_{n-1} = \varnothing$). The second respondent always faces this situation. As all the antecedents of the n-th respondent are casting voters, $Y_1, Y_2, ..., Y_{n-2}$ and $Y_{n-1}$ are independent of each other. In addition, the primary choice $X_n$ of the n-th respondent is independent of the answers of his/her antecedents.
Therefore, for the n-th respondent, we can calculate the probability of having the answers and primary choice (Y 1 , ..., Y n−1 , X n ) when one alternative is correct, which is also regarded as the likelihood that the alternative is correct, given this opinion distribution, in the same manner as that for simultaneous decision-making in Section S.1.
When the primary choice of the n-th respondent is s, the likelihood of s being correct given $(Y_1, ..., Y_{n-1}, X_n = s)$ is $P(S_{n-1}) Q(T_{n-1}) p_n$, and that of t being correct is $Q(S_{n-1}) P(T_{n-1}) (1 - p_n)$. Based on the same discussion as that for simultaneous decision-making (Section S.1), the n-th respondent should answer s with probability 1 if $r^*_n + \sum_{m \in S_{n-1}} r^*_m > \sum_{m \in T_{n-1}} r^*_m$ to maximise the conditional performance given these answers and the primary choice. By contrast, he/she should answer t with probability 1 if $r^*_n + \sum_{m \in S_{n-1}} r^*_m < \sum_{m \in T_{n-1}} r^*_m$.
The conditional performance $\pi_n(S_{n-1}, T_{n-1}, X_n = s)$ of the n-th respondent is

Similarly, when $X_n = t$, the likelihood of s being correct given $(Y_1, ..., Y_{n-1}, X_n = t)$ is $P(S_{n-1}) Q(T_{n-1}) (1 - p_n)$, and that of t being correct is $Q(S_{n-1}) P(T_{n-1}) p_n$. The n-th respondent should answer s with probability 1 if $\sum_{m \in S_{n-1}} r^*_m > r^*_n + \sum_{m \in T_{n-1}} r^*_m$ and answer t with probability 1 if $\sum_{m \in S_{n-1}} r^*_m < r^*_n + \sum_{m \in T_{n-1}} r^*_m$ to maximise the conditional performance given the answers and primary choice. The resulting conditional performance

Here, we summarise the optimal behaviour of the n-th respondent given the opinion distribution $(S_{n-1}, T_{n-1})$ of the antecedents based on the relationship between the abilities of the antecedents and the respondent; more precisely, based on the relationship between $r^*_n + \sum_{m \in S_{n-1}} r^*_m$ and $\sum_{m \in T_{n-1}} r^*_m$, and between $\sum_{m \in S_{n-1}} r^*_m$ and $r^*_n + \sum_{m \in T_{n-1}} r^*_m$, as in the following four cases:

Case of $r^*_n + \sum_{m \in S_{n-1}} r^*_m > \sum_{m \in T_{n-1}} r^*_m$ and $\sum_{m \in S_{n-1}} r^*_m > r^*_n + \sum_{m \in T_{n-1}} r^*_m$
In this case, the n-th respondent answers s with probability 1 when $X_n = s$, and also answers s with probability 1 when $X_n = t$, as his/her optimal behaviour (see rules A) and B) in Section S.1.1). Therefore, the respondent should answer s with probability 1 irrespective of his/her primary choice. Thus, his/her primary choice is not a casting vote, and n is included in $R_n$ ($R_n = R_{n-1} \cup \{n\}$). The probability of finally observing the answers by the first to the n-th respondents can be written as
• $P(S_{n-1}) Q(T_{n-1})$ when s is correct.
• $Q(S_{n-1}) P(T_{n-1})$ when t is correct.

Case of $r^*_n + \sum_{m \in S_{n-1}} r^*_m < \sum_{m \in T_{n-1}} r^*_m$ and $\sum_{m \in S_{n-1}} r^*_m < r^*_n + \sum_{m \in T_{n-1}} r^*_m$
By a discussion similar to that for the first case, the n-th respondent answers t with probability 1 irrespective of his/her primary choice, as his/her optimal behaviour (see rules A) and B) in Section S.1.1). Therefore, his/her primary choice is not a casting vote ($R_n = R_{n-1} \cup \{n\}$). The probability of finally observing a set of answers by the first to the n-th respondents can be represented as
• $P(S_{n-1}) Q(T_{n-1})$ when s was correct.
• $Q(S_{n-1}) P(T_{n-1})$ when t was correct.

Case of $r^*_n + \sum_{m \in S_{n-1}} r^*_m > \sum_{m \in T_{n-1}} r^*_m$ and $\sum_{m \in S_{n-1}} r^*_m < r^*_n + \sum_{m \in T_{n-1}} r^*_m$ (S.14)
The n-th respondent answers s when his/her primary choice is s and answers t when his/her primary choice is t, as his/her optimal behaviour (see rules A) and B) in Section S.1.1).
This means that his/her answer is his/her primary choice. Therefore, his/her primary choice is a casting vote. Thus, n is included in $S_n$ or $T_n$ according to his/her primary choice. Given the opinion distribution $(S_{n-1}, T_{n-1})$ of answers by the antecedents, the conditional probability that the n-th respondent answers s is $p_n$ and that he/she answers t is $1 - p_n$ when s is correct. When t is correct, the conditional probability of answering s is $1 - p_n$ and that of answering t is $p_n$. Therefore, the probability of finally observing a set of answers by the first to the n-th respondents can be written as
• $P(S_{n-1}) Q(T_{n-1}) p_n$ when $X_n = s$ and s was correct.
• $P(S_{n-1}) Q(T_{n-1}) (1 - p_n)$ when $X_n = t$ and s was correct.
• $Q(S_{n-1}) P(T_{n-1}) (1 - p_n)$ when $X_n = s$ and t was correct.
• $Q(S_{n-1}) P(T_{n-1}) p_n$ when $X_n = t$ and t was correct.

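Case of $r^*_n + \sum_{m \in S_{n-1}} r^*_m < \sum_{m \in T_{n-1}} r^*_m$ and $\sum_{m \in S_{n-1}} r^*_m > r^*_n + \sum_{m \in T_{n-1}} r^*_m$
This case cannot occur because $p_n > 0.5$ and hence $r^*_n > 0$.
Here, the conditions $r^*_n + \sum_{m \in S_{n-1}} r^*_m > \sum_{m \in T_{n-1}} r^*_m$ and $\sum_{m \in S_{n-1}} r^*_m > r^*_n + \sum_{m \in T_{n-1}} r^*_m$ in the first case can be reduced to only $\sum_{m \in S_{n-1}} r^*_m > r^*_n + \sum_{m \in T_{n-1}} r^*_m$ because the former inequality is always satisfied when the latter is satisfied. Similarly, in the second case, the conditions $r^*_n + \sum_{m \in S_{n-1}} r^*_m < \sum_{m \in T_{n-1}} r^*_m$ and $\sum_{m \in S_{n-1}} r^*_m < r^*_n + \sum_{m \in T_{n-1}} r^*_m$ can be reduced to only $r^*_n + \sum_{m \in S_{n-1}} r^*_m < \sum_{m \in T_{n-1}} r^*_m$. In the third case, $r^*_n + \sum_{m \in S_{n-1}} r^*_m > \sum_{m \in T_{n-1}} r^*_m$ and $\sum_{m \in S_{n-1}} r^*_m < r^*_n + \sum_{m \in T_{n-1}} r^*_m$ together mean $|\sum_{m \in S_{n-1}} r^*_m - \sum_{m \in T_{n-1}} r^*_m| < r^*_n$. This summary of optimal behaviour according to the relationship between the respondents' abilities is incorporated into Table 1 in the main text (Section 3.2).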
In summary, the n-th respondent should either give the same answer regardless of his/her primary choice or always answer his/her primary choice, and n is assigned to either $R_n$, $S_n$, or $T_n$ given that the respondent's antecedents are separated into $S_{n-1}$ and $T_{n-1}$. The probability of finally observing the answers of the first to the n-th respondents can also be summarised as follows. When s is correct, this probability is expressed as

When t is correct, this probability is expressed as

Case 2) Antecedents are separated into $S_{n-1}$, $T_{n-1}$ and $R_{n-1}$
Thus far, we have shown that by the optimal behaviour of the m-th respondent, m can be assigned to either $S_m$, $T_m$, or $R_m$ provided that the antecedents' indices are separated into either $S_{m-1}$ or $T_{m-1}$ ($R_{m-1} = \varnothing$). We now consider decision-making by the n-th respondent when the antecedents' indices are separated into $S_{n-1}$, $T_{n-1}$, and $R_{n-1}$ ($\neq \varnothing$).
From the discussion in Case 1, the probability of observing the answers by the antecedents given the separation $(S_{n-1}, T_{n-1}, R_{n-1})$ can be represented as $P(S_{n-1}) Q(T_{n-1}) \prod_{m \in R_{n-1}} 1 = P(S_{n-1}) Q(T_{n-1})$ when s is correct and $Q(S_{n-1}) P(T_{n-1}) \prod_{m \in R_{n-1}} 1 = Q(S_{n-1}) P(T_{n-1})$ when t is correct. Therefore, given the opinion distribution $(S_{n-1}, T_{n-1}, R_{n-1})$, the probability of observing the answers by the antecedents can be written using only the abilities of the antecedents in $S_{n-1}$ and $T_{n-1}$, as in Case 1. The optimal behaviour for the n-th respondent in Case 2 is the same as that shown in Case 1 if we replace $\pi_n(S_{n-1}, T_{n-1}, X_n = s)$, $\pi_n(S_{n-1}, T_{n-1}, X_n = t)$, $Z_{S_{n-1},T_{n-1},X_n=s}$ and $Z_{S_{n-1},T_{n-1},X_n=t}$ with $\pi_n(S_{n-1}, T_{n-1}, R_{n-1}, X_n = s)$, $\pi_n(S_{n-1}, T_{n-1}, R_{n-1}, X_n = t)$, $Z_{S_{n-1},T_{n-1},R_{n-1},X_n=s}$ and $Z_{S_{n-1},T_{n-1},R_{n-1},X_n=t}$, respectively. The probability of finally observing the answers of the first to the n-th respondents can be represented as follows in the same manner as in Case 1:
• $P(S_{n-1}) Q(T_{n-1}) p_n$ when $X_n = s$ and s was correct.
• $P(S_{n-1}) Q(T_{n-1}) (1 - p_n)$ when $X_n = t$ and s was correct.
• $Q(S_{n-1}) P(T_{n-1}) (1 - p_n)$ when $X_n = s$ and t was correct.
• $Q(S_{n-1}) P(T_{n-1}) p_n$ when $X_n = t$ and t was correct.
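The optimal behaviour derived in Cases 1 and 2 can be sketched as a short procedure that assigns each respondent to S, T, or R. This is an illustrative sketch rather than the algorithm of Table 1 in the main text: it assumes $r^*_n = \log[p_n/(1 - p_n)]$, uses 0-based indices, and treats the boundary case of exact equality as a casting vote; the function names and example values are hypothetical.

```python
import math

def log_odds(p):
    """Log-odds of an ability p with 0.5 < p < 1 (assumed definition of r*)."""
    return math.log(p / (1.0 - p))

def decide(ability_n, primary_choice, S, T, abilities):
    """Return (answer, is_casting_voter) given the casting-voter sets S and T of the antecedents."""
    r_n = log_odds(ability_n)
    r_s = sum(log_odds(abilities[m]) for m in S)
    r_t = sum(log_odds(abilities[m]) for m in T)
    if r_s > r_n + r_t:        # casting votes for s outweigh everything else
        return "s", False
    if r_t > r_n + r_s:        # casting votes for t outweigh everything else
        return "t", False
    return primary_choice, True  # the primary choice is decisive

def run_sequence(abilities, primary_choices):
    """Apply the rule to respondents in order; antecedents in R are ignored by later respondents."""
    S, T, R, answers = [], [], [], []
    for n, (p, x) in enumerate(zip(abilities, primary_choices)):
        answer, casting = decide(p, x, S, T, abilities)
        answers.append(answer)
        if not casting:
            R.append(n)
        elif answer == "s":
            S.append(n)
        else:
            T.append(n)
    return answers, S, T, R

print(run_sequence([0.6, 0.8, 0.7], ["s", "t", "s"]))
```

In this example the third respondent follows the more able second respondent even though his/her own primary choice agrees with the first, so he/she is assigned to R.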
Finally, for the n-th respondent, the mean performance E[π n (S n−1 , T n−1 , R n−1 , X n )] over the possible opinion distribution (S n−1 , T n−1 , R n−1 ) of the antecedents and primary choice X n can be written as:

S.2.2 Sequential decision-making of three individuals
Here, we show in detail the optimal behaviour of the second and third respondents in sequential decision-making, as shown in Section 3.3 in the main text.

Optimal behaviour of the second respondent
First, we consider decision-making by the second respondent. As his/her primary choice $X_2$ is independent of the answer $Y_1$ by the first respondent, his/her optimal behaviour can be determined by applying the calculation for simultaneous decision-making involving two individuals (Section S.1.2). When $p_1 > p_2$, the expert rule governed by the answer of the first respondent is optimal in the decision-making of the second respondent. Therefore, $Y_2$ should always be the same as $Y_1$, and $2 \in R_2$. When $p_1 < p_2$, the respondent should always give his/her primary choice $X_2$ as the answer, i.e. he/she should be a casting voter, and 2 is included in either $S_2$ or $T_2$ according to his/her primary choice.

Optimal behaviour of the third respondent
Subsequently, we consider decision-making by the third respondent. When $p_1 < p_2$, the second respondent is a casting voter ($2 \in S_2$ or $2 \in T_2$). Therefore, $Y_1$, $Y_2$ and $X_3$ are independent of one another. Thus, when $p_1 < p_2$, the optimal behaviour of the third respondent can be determined in the same manner as that for simultaneous decision-making involving three individuals (Section S.1.3). Hereafter, we determine the optimal behaviour of the third respondent when $p_1 > p_2$ and $2 \in R_2$.
When the primary choice $X_3$ of the third respondent is s, which is the same as the answer of the first respondent, the third respondent should always answer s because $r^*_1 + r^*_3 > 0$. We do not have to consider $r^*_2$ because $2 \in R_2$. Based on this optimal behaviour, the conditional performance $\pi_3(S_2, T_2, R_2, X_3 = s)$ becomes:

using Eq. (S.10). When $X_3 = t$, the respondent should answer s if $r^*_1 > r^*_3$, i.e. $p_1 > p_3$, and answer t if $r^*_1 < r^*_3$, i.e. $p_1 < p_3$. Then, the conditional performance becomes

by Eq. (S.11).
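The behaviour of the second and third respondents described in this section can be verified by exact enumeration over all patterns of correct and wrong primary choices. The sketch below is illustrative only: it assumes $r^*_n = \log[p_n/(1 - p_n)]$, tracks casting voters by whether their answers are correct or wrong (equivalent to the s/t labelling by symmetry), and uses a hypothetical function name and example abilities.

```python
import math
from itertools import product

def log_odds(p):
    """Log-odds of an ability p with 0.5 < p < 1 (assumed definition of r*)."""
    return math.log(p / (1.0 - p))

def sequential_accuracies(abilities):
    """Exact probability that each respondent's final answer is correct."""
    accuracy = [0.0] * len(abilities)
    for pattern in product([True, False], repeat=len(abilities)):
        # Probability of this pattern of correct / wrong primary choices.
        prob = math.prod(p if ok else 1.0 - p for p, ok in zip(abilities, pattern))
        right = wrong = 0.0   # summed log-odds of casting voters who answered correctly / wrongly
        for n, (p, primary_ok) in enumerate(zip(abilities, pattern)):
            r_n = log_odds(p)
            if right > r_n + wrong:
                answer_ok = True          # follows the correct casting votes
            elif wrong > r_n + right:
                answer_ok = False         # follows the wrong casting votes
            else:
                answer_ok = primary_ok    # casting voter: answers the primary choice
                if answer_ok:
                    right += r_n
                else:
                    wrong += r_n
            if answer_ok:
                accuracy[n] += prob
    return accuracy

print(sequential_accuracies([0.8, 0.6, 0.7]))   # p1 > p2 and p1 > p3
print(sequential_accuracies([0.8, 0.6, 0.9]))   # p1 > p2 and p1 < p3
```

With $p_1 > p_2$, the second respondent inherits the accuracy of the first, and the third respondent attains the larger of $p_1$ and $p_3$.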
Therefore, when $p_1 > p_2$, the mean performance $E[\pi_3(S_2, T_2, R_2, X_3)]$ of the third respondent can be written as

which is $p_1$ when $p_1 > p_3$ and $p_3$ when $p_1 < p_3$.

S.2.3 Sequential decision-making of individuals
In this section, we assume the sequential decision-making of individuals with the same ability p as shown in Section 3.3 of the main text.
We consider decision-making by the n-th respondent. Let $S_{n-1}$ and $T_{n-1}$ be the sets of indices of the antecedents that were casting voters and answered s and t, respectively, as explained in the previous section. In addition, let $R_{n-1}$ denote the set of antecedents that are not casting voters. |X| denotes the number of elements in the set X.
The optimum answer $Y_n$ for the n-th respondent can be derived using the algorithm shown in Table 1 in the main text. When $X_n = s$, and, when $X_n = t$,

Figure S.2: Number of correct and wrong answers in the sequential decision-making involving five individuals. The pair (x, y) in the m-th column from the left-hand side shows the number of correct (x) and wrong (y) answers among the answers given by the antecedents of the m-th respondent. The transition probability between two states of answers is shown on the line between two brackets. The probability that the sixth respondent will observe each state of answers is shown on the right-hand side.

fourth respondent do not answer incorrectly.
Let us first assume that n is odd, i.e. $n = 2k - 1$ (k = 1, 2, ...). We derive the conditional and mean performance of the n-th respondent in this case. The conditional performance of the n-th respondent given $(S_{n-1}, T_{n-1}, R_{n-1})$ in the answers of his/her antecedents and his/her primary choice, $\pi_n(S_{n-1}, T_{n-1}, R_{n-1}, X_n)$, is described as follows.
When s is correct,
• if $d_n = |S_{n-1}| - |T_{n-1}| = 2$ ($|T_{n-1}| = 0, 1, ..., k - 2$), the probability of observing the answers given by the antecedents of the n-th respondent is $[2p(1 - p)]^{|T_{n-1}|} p^2$. The respondent answers s regardless of his/her primary choice as his/her optimal behaviour, and the conditional performance $\pi_n(S_{n-1}, T_{n-1}, R_{n-1}, X_n)$ is 1.
Here, $d_n$ cannot be 1 when n is odd, for the following reason. $|S_{n-1}| + |T_{n-1}| + |R_{n-1}|$ is even because $n = |S_{n-1}| + |T_{n-1}| + |R_{n-1}| + 1$ is odd. If $d_n$ were 1, $R_{n-1}$ would have to be empty because $d_n < 2$, as discussed previously. Therefore, $|S_{n-1}| + |T_{n-1}|$ equals $|S_{n-1}| + |T_{n-1}| + |R_{n-1}|$ and is even. Thus, $d_n = |S_{n-1}| - |T_{n-1}|$ should be even, which contradicts $d_n = 1$.
When t is correct,
• if $d_n = 2$, the probability of observing the answers given by the antecedents of the n-th respondent is $[2p(1 - p)]^{|T_{n-1}|} (1 - p)^2$. The respondent answers s regardless of his/her primary choice as his/her optimal behaviour, and $\pi_n(S_{n-1}, T_{n-1}, R_{n-1}, X_n) = 0$.
• if $d_n = 0$, i.e. $|S_{n-1}| = |T_{n-1}| = k - 1$, the probability of observing the answers given by the antecedents of the n-th respondent is $[2p(1 - p)]^{k-1}$. His/her optimal behaviour is giving his/her primary choice as the answer, and $\pi_n(S_{n-1}, T_{n-1}, R_{n-1}, X_n) = p$.
We further investigate the mean performance of the n-th respondent when n is odd. The probability that s is correct is p and that t is correct is 1 − p because the ability of the first respondent is p. Therefore, the mean of the conditional performance $E[\pi_n] := E[\pi_n(S_{n-1}, T_{n-1}, R_{n-1}, X_n)]$ over the possible $(S_{n-1}, T_{n-1}, R_{n-1}, X_n)$, simply called the mean performance, is

The n-th respondent then unconditionally answers t, which is correct. Finally, if $d_n$ is either −1, 0, or 1, the probability of which is $Q^n_{\{-1,0,1\}}$, then the n-th respondent answers his/her primary choice as he/she is a casting voter. The accuracy of his/her answer is therefore p, which gives the second term of (S.26).
Subsequently, we assume that n is even, i.e. $n = 2k$ (k = 1, 2, ...). In this case, $|S_{n-1}| + |T_{n-1}|$ is odd, and $d_n$ cannot be 0 by a discussion similar to that for the case where n is odd. Here, we must consider the conditional performance of the n-th respondent when $|d_n| = 1$, which does not occur when n is odd.
When s was correct,
• if $|d_n| = 2$, the conditional performance $\pi_n(S_{n-1}, T_{n-1}, R_{n-1}, X_n)$ of the n-th respondent given the opinion distribution $(S_{n-1}, T_{n-1}, R_{n-1})$ among his/her antecedents and his/her primary choice is the same as that in the case where n is odd.
• if $d_n = 1$, the probability of observing the answers given by the antecedents of the n-th respondent is $[2p(1 - p)]^{k-1} p$. His/her optimal behaviour is giving his/her primary choice as the answer, and $\pi_n(S_{n-1}, T_{n-1}, R_{n-1}, X_n) = p$.
• if $d_n = -1$, the probability of observing the answers given by the antecedents of the n-th respondent is $[2p(1 - p)]^{k-1} (1 - p)$. His/her optimal behaviour is giving his/her primary choice as the answer, and $\pi_n(S_{n-1}, T_{n-1}, R_{n-1}, X_n) = p$.
When t was correct,
• if $|d_n| = 2$, the conditional performance is the same as that in the case where n is odd.
• if $d_n = 1$, the probability of observing the answers given by the antecedents of the n-th respondent is $[2p(1 - p)]^{k-1} (1 - p)$. His/her optimal behaviour is giving his/her primary choice as the answer, and $\pi_n(S_{n-1}, T_{n-1}, R_{n-1}, X_n) = p$.
• if $d_n = -1$, the probability of observing the answers given by the antecedents of the n-th respondent is $[2p(1 - p)]^{k-1} p$. His/her optimal behaviour is giving his/her primary choice as the answer, and $\pi_n(S_{n-1}, T_{n-1}, R_{n-1}, X_n) = p$.
The mean performance $E[\pi_n]$ of the n-th respondent when n is even is

which is the same as in Eq. (S.26).
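The odd and even cases can be cross-checked numerically by propagating the distribution of the lead among casting voters. The sketch below is an assumption-laden illustration, not a transcription of Eq. (S.26): it measures the lead c as the number of correct minus the number of wrong casting votes (a relabelling of $d_n$), assumes that a respondent herds once |c| reaches 2 and otherwise answers his/her primary choice, and uses hypothetical function names.

```python
def mean_performance(p, n):
    """Accuracy of the n-th respondent (n >= 1) when every respondent has ability p."""
    dist = {0: 1.0}   # dist[c]: probability that correct casting votes lead wrong ones by c
    for _ in range(n - 1):
        nxt = {}
        for c, prob in dist.items():
            if abs(c) >= 2:
                # Lead of two: later respondents copy the majority, so the lead is frozen.
                nxt[c] = nxt.get(c, 0.0) + prob
            else:
                # Casting voter: answers the primary choice, correct with probability p.
                nxt[c + 1] = nxt.get(c + 1, 0.0) + prob * p
                nxt[c - 1] = nxt.get(c - 1, 0.0) + prob * (1.0 - p)
        dist = nxt
    herd_right = dist.get(2, 0.0)
    casting = sum(prob for c, prob in dist.items() if abs(c) <= 1)
    return herd_right + p * casting

def limit_performance(p):
    """Limiting accuracy p^2 / (p^2 + (1 - p)^2) as n grows."""
    return p * p / (p * p + (1.0 - p) ** 2)

for n in (1, 2, 3, 4, 5, 6, 201):
    print(n, mean_performance(0.6, n))
print(limit_performance(0.6))
```

Under these assumptions the values for n = 2m − 1 and n = 2m coincide, in line with the statement that the even case gives the same mean performance as the odd case.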
In summary, for both $n = 2m - 1$ and $n = 2m$ (m ≥ 1), the mean performance of the n-th respondent is given by Eq. (S.28).

Here, we assume an individual who has a higher ability q (> p) than the others, called an expert, in the sequential decision-making of individuals with the same ability p discussed so far. The expert is assumed to make a decision at the n-th earliest position (n ≥ 3), and the mean performance is denoted by $E[\pi_{n,q}]$, as explained in the main text.
First, we assume that the expert knows his/her ability q. The optimal behaviour of the expert can be summarised according to the difference $d_n$ between $|S_{n-1}|$ and $|T_{n-1}|$ in his/her antecedents as follows:
• In the case of $d_n = 2$, the expert answers s regardless of his/her primary choice as the optimal behaviour if $|S_{n-1}| r^* > |T_{n-1}| r^* + \log[q/(1 - q)]$, which is equivalent to $q < e^{2r^*} / (1 + e^{2r^*}) = p^2 / [p^2 + (1 - p)^2] = \pi_{\max}(p)$, by Table 1 in the main text. Otherwise, his/her optimal behaviour is giving his/her primary choice as the answer.
• In the case of $|d_n| = 1$ or $|d_n| = 0$, his/her optimal behaviour is giving his/her primary choice as the answer because q is greater than p and he/she should be a casting voter even if his/her ability were p (< q) in these cases.
• In the case of $d_n = -2$, by a discussion similar to that in the case of $d_n = 2$, the expert answers t regardless of his/her primary choice as the optimal behaviour if $q < \pi_{\max}(p)$. Otherwise, his/her optimal behaviour is giving his/her primary choice as the answer.
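The three cases for the expert can be written as a short decision function. This is a hedged sketch under the assumption that $r^* = \log[p/(1 - p)]$ and that $d_n$ takes values in {−2, ..., 2}; the function names and the example values of p and q are hypothetical.

```python
import math

def pi_max(p):
    """Threshold p^2 / (p^2 + (1 - p)^2) above which the expert is always a casting voter."""
    return p * p / (p * p + (1.0 - p) ** 2)

def expert_answer(q, p, d, primary_choice):
    """Optimal answer of an expert with ability q, given the lead d = |S| - |T| among casting voters of ability p."""
    r = math.log(p / (1.0 - p))     # log-odds of an ordinary respondent
    r_q = math.log(q / (1.0 - q))   # log-odds of the expert
    if d * r > r_q:
        return "s"                  # the lead for s outweighs the expert's own choice
    if -d * r > r_q:
        return "t"                  # the lead for t outweighs the expert's own choice
    return primary_choice           # the expert is a casting voter

p = 0.6
print(pi_max(p))
print(expert_answer(0.65, p, 2, "t"))   # q below the threshold: follows the majority
print(expert_answer(0.75, p, 2, "t"))   # q above the threshold: keeps the primary choice
```

The comparison of $d_n r^*$ with the expert's log-odds is equivalent to comparing q with $\pi_{\max}(p)$ when $|d_n| = 2$, which is why a single threshold separates the two regimes.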
In summary, when $q > \pi_{\max}(p)$, the expert should be a casting voter, and his/her conditional performance and mean performance are q. When $q < \pi_{\max}(p)$, the expert's ability is not sufficiently high; therefore, he/she should always give the majority alternative among the opinions expressed by his/her antecedents if $|d_n| = 2$ and should be a casting voter if $|d_n|$ is 1 or 0; this optimal behaviour is the same as that in the case where the n-th respondent has the ability p. The mean performance $E[\pi_{n,q}]$ of the expert when $q < \pi_{\max}(p)$ is

where $n = 2k - 1$ or $n = 2k$ (k = 1, 2, ...).
Therefore, the mean performance $E[\pi_{n,q}]$ of the n-th respondent ($n = 2k - 1$ or $n = 2k$) with ability q, who knows his/her ability, is summarised as follows:

Subsequently, we assume that the expert is not aware of his/her superiority to others in terms of ability and believes that his/her ability is the same as that of the others, p. In this case, the expert takes on the optimal behaviour for a respondent whose ability is p. Therefore, his/her answer will be the opinion shared by the majority of antecedents when $|d_n| = 2$, and he/she should always give his/her primary choice as the answer when $|d_n|$ is 1 or 0. Thus, the mean performance of the expert who believes that his/her ability is the same as that of the others can be written as follows:

where $n = 2k - 1$ or $n = 2k$ (k = 1, 2, ...).

S.2.3.3 Effective size
We evaluated the sequential decision-making of individuals with ability p by its effective number of voters, which is defined as follows. Let us denote the mean performance of the

where $n_e$ is odd, as generally assumed in the literature on collective intelligence [5].